ABSTRACT



A system for upgrading software has multiple clients 
coupled to an upgrade server. The clients store an old version 
of software. The upgrade server stores both the old version 
of software and a new version of software. The upgrade 
server creates an upgrade file from the old and new versions 
of the software such that the upgrade file is smaller than the 
new version. The upgrade server compares old character 
strings from the old version with new character strings from 
the new version to identify matching sections. The upgrade 
server derives a two-dimensional table containing multiple 
entries, whereby each entry represents a length of a longest 
common substring beginning at a first position in the old 
character string and at a second position in the new character 
string. The upgrade server then ascertains the longest com- 
mon substring from the table. The upgrade server inserts 
headers into the upgrade file to distinguish between match- 
ing and non-matching sections. For matching sections, only 
the header is included and the section is omitted. The clients 
receive the upgrade file and begin processing the file to 
reconstruct the new version of software from the new 
sections included in the upgrade file and from the matching 
sections obtained locally from the stored old version of 
software. 

42 Claims, 10 Drawing Sheets 
SYSTEM AND METHOD FOR UPGRADING
CLIENT SOFTWARE 

TECHNICAL FIELD 

This invention relates to distributed client-server systems 
and methods for upgrading client software from an upgrade 
server. 

BACKGROUND 

In traditional client-server systems, the server upgrades 
software on the client by transferring a new version of the 
program. The client is equipped with adequate memory 
resources to store both the old and new versions of the 
program. When the new version is present, the client informs 
the user that an upgrade is available and gives the user an 
opportunity to upgrade to the new version. If the user agrees, 
the old version is renamed out of the way and the new 
version is renamed to the default name used by the client 
when booting up or calling the program. 

With the advent of alternative client products having 
limited processing capabilities and memory, this traditional 
model of upgrading software on the client cannot be used 
because the client is unable to store the entire new version 
of software. These scaled down or "thin" clients are typi- 
cally constructed with just enough functionality to enable 
access to the server computer over a network. The thin client 
is typically able to store one version of the software, plus a 
little more. Examples of thin clients include low cost com- 
puters known as "network computers" or "NCs" and tele- 
vision set-top boxes (STBs). 

This invention concerns a method for upgrading software 
on thin clients, although the method can be applied in other 
server-client contexts that employ general-purpose comput- 
ing clients. 

SUMMARY 

This invention concerns a system for upgrading software 
in a client-server architecture. The system has multiple 
clients coupled to an upgrade server. The clients have 
limited processing and storage capabilities. Examples of 
such clients include network computers, set-top boxes, por- 
table information devices, and so forth. The clients store an 
old version of software, such as in a flash memory. 

The upgrade server has a processor and a memory. The 
upgrade server stores both the old version of software and a 
new version of software. The upgrade server runs an 
upgrade program that creates an upgrade file from the old 
and new versions of software such that the upgrade file is 
much smaller than the new version. In the compressed 
upgrade file, the upgrade server distinguishes between 
matching sections that match in both the old and new 
versions from non-matching sections that are present only in 
the new version with no counterpart in the old version. 

The upgrade server identifies the matching sections by 
comparing an old character string (or any arbitrary string of 
bytes) from the old version with a new character string (or 
any arbitrary string of bytes) from the new version. The 
upgrade server finds common substrings in the two character 
strings. In one implementation, the upgrade server derives a 
two-dimensional table containing multiple entries, whereby 
each entry represents a length of a longest common sub- 
string beginning at a first position in the old character string 
and at a second position in the new character string. The 
upgrade server then ascertains the longest common substring 
from the table. 

For matching sections, the upgrade server creates pointer
headers that identify the sections in the old version that
match sections in the new version. The upgrade server
inserts the pointer headers into the upgrade file in lieu of the
matching sections. For non-matching sections, the upgrade
server creates data headers and inserts them and their
corresponding non-matching sections from the new version
into the upgrade file. The data headers indicate that the
accompanying sections contain new data.

The upgrade server transfers the upgrade file, which is a
compressed form of the new version of software. The client
receives the upgrade file and begins processing the file to
reconstruct the new version of software from the upgrade
file and the old version stored locally. Upon reaching a data
header, the client adds the new section from the new version.
Upon reaching a pointer header, the client copies the
common substring from the old version into the recreated new
version. After the entire upgrade file is processed, the client
possesses the new version of the software. The client can
then inform the user, and upon reboot, begin operation using
the new software version.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 illustrates a client-server system in which a server
is configured to upgrade software on a client.

FIG. 2 is a block diagram of functional components in the
client.

FIG. 3 is a block diagram of functional components in the
server.

FIG. 4 is a flowchart showing steps in a method for 
creating an upgrade file from old and new versions of 
software. 

FIGS. 5a-5c show a table constructed by the server to find
longest common substrings in both the old and new versions
of software. The table is shown at different levels of
completion in the three figures.

FIG. 6 is a flowchart showing steps in a method for
reconstructing the new version of software from the old
version and the upgrade file formed by the steps in FIG. 4.

FIG. 7 is a flowchart showing steps in another method for
creating an upgrade file from old and new versions of
software by using a hashing table.

FIGS. 8a and 8b present a flowchart showing steps in a
method for creating an upgrade file from old and new
versions of software that are written in a specific file
structure in which parts of the structure are compressed.

FIG. 9 is a flowchart showing steps in a method for
reconstructing the new version of specific file structure from
the old version and the upgrade file formed by the steps in
FIGS. 8a and 8b.

DETAILED DESCRIPTION 

This invention concerns a system for upgrading software
in a client-server architecture. The invention is described
generally in the context of thin clients, although aspects of
this invention may be implemented in other client-server
environments that do not use thin clients.

System Overview

FIG. 1 shows a client-server system 20 having a client 22
connected to an upgrade server 24 via a network 26. The
system 20 is representative of many different network
systems, involving many diverse types of clients and a wide
variety of networks including both wire-based and wireless
technologies. For instance, the system 20 might be an
Internet-based system in which the client and server are
interconnected via the Internet, and the upgrade server
transfers an upgrade file over the Internet to the client. The
client 22 and server 24 connect to the Internet via
conventional means, such as a modem, network connection,
through an Internet Service Provider (ISP), and so forth. In
this context, the client might be a computer, a thin client, a
set-top box, or an information appliance.

As another example, the system 20 is representative of a 
television system in which the client and upgrade server are 
interconnected via a television distribution network, such as 
cable, RF, microwave, and satellite. In this context, the client 
22 includes a set-top box and the upgrade server downloads 
an upgrade file to the set-top box via the TV distribution
network. 

As another example, the system 20 is representative of a 
system for programming portable devices in which the 
upgrade server transmits an upgrade file to a portable 
information device via a wire or wireless link. Examples of 
portable information devices include personal organizers, 
palm-size computers, cellular phones, programmable 
watches, pagers, and so forth. One particular example 
involving portable information devices is described in 
co-pending U.S. patent application Ser. No. 08/394,659, 
entitled "System and Method for Remotely Managing 
Memory in a Portable Information Device from an External 
Computer," which was filed Feb. 22, 1995. This application
is assigned to Microsoft Corporation and is incorporated by 
reference. 

The client 22 is preferably a thin client having enough 
processing and storage capabilities to store and run an 
operating system 30 and a program 32. Examples of pro- 
grams stored on the client include a Web browser, an 
electronic programming guide, a personal scheduler, and so 
forth. The client 22 is typically not equipped with additional 
storage resources to store multiple programs or multiple 
copies of one program or a means for a user to load new 
software. As a result, the upgrade server periodically trans- 
fers new versions of the program in real-time to replace the 
old version of the program currently executing at the client. 

The upgrade server 24 stores both the old version 32 of
the program and a new version 34. The upgrade server runs 
an upgrade program 36 that creates an upgrade file 38 from 
the old and new versions. The upgrade file 38 is smaller than 
the new version 34, but can be used by the client to 
reconstruct the new version from the old version. 

The upgrade program 36 treats the versions as images of 
raw data or strings of characters (i.e., numbers, letters, 
symbols, etc.), rather than lines of code. The upgrade 
program 36 compares the two images and distinguishes 
between sections that match in both versions from sections 
appearing only in the new version but having no counterpart 
in the old version. 

In the upgrade file 38, the matching sections are replaced 
by "pointer headers" in lieu of the sections themselves. Each 
pointer header contains information to locate the associated 
section in the old version that is locally stored on the client. 
The client uses the pointer header to locate and copy the 
section when reconstructing the new version. Thus, the 
section need not be downloaded from the upgrade server. In 
FIG. 1, the matching sections A and C in the old and new 
versions 32 and 34 are replaced in the upgrade file 38 with 
pointer headers PH. 

A second header, referred to as a "data header",
demarcates each of the non-matching sections in the upgrade file.
The non-matching sections are inserted following
corresponding data headers. The data headers specify the size of
the following data sections. In FIG. 1, the non-matching
section D in the new version 34 is added to the upgrade file
beneath a corresponding data header DH.

The upgrade program 36 transfers the upgrade file 38 to
the client 22. Since it is likely that only a small amount of
the code has actually changed from the old version to the
new version, large portions of the new version need not be
downloaded. The pointer headers convey essentially the
same information to the client. Accordingly, the upgrade file
38 is likely to be substantially smaller than the new version
of the software, thereby enabling a more rapid real-time
download of the file in comparison to downloading the
entire new version.

The operating system 30 at the client 22 is capable of
processing the upgrade file 38 to reconstruct the new version
of the program. Upon reaching a data header, the client 22
adds the new section contained in the upgrade file 38 to the
reconstructed program. Upon reaching a pointer header, the
client 22 copies the matching section from the old version 32
into the reconstructed program. After the entire upgrade file
is processed, the client has the new version of the program.

Exemplary Client

FIG. 2 shows the client 22 implemented as a set-top box
according to one exemplary implementation of this
invention. The client 22 has a central processing unit (CPU) 50
coupled to an application-specific integrated circuit (ASIC)
52. The ASIC 52 contains logic circuitry, bussing circuitry,
and a video controller.

The client 22 has a Random Access Memory (RAM) 54,
a Read Only Memory (ROM) 56, and a flash memory 58
coupled to the ASIC 52. RAM 54 temporarily stores the
upgrade file 38 and the new program as it is being
reconstructed. ROM 56 stores the operating system 30. The flash
memory 58 stores the program 32 (i.e., browser software,
electronic programming guide, etc.) that is periodically
upgraded. The flash memory 58 initially stores the old
version of the program, but following completion of the new
version, replaces the old version with the reconstructed new
version.

The client 22 has a video input 60 to receive television
signals that are passed through the set-top box to the
television set. The client also has a network connection 62
(e.g., modem) to provide connection to the network 26 and
to communicate with the upgrade server. Other components
of a set-top box (an IR interface, a television decoder, an
audio digital-to-analog converter, and the like) are not
shown for simplicity.

Exemplary Upgrade Server

FIG. 3 shows an exemplary implementation of an upgrade
server 24. It has a processing unit 72, memory 74 (e.g.,
RAM, ROM, flash, floppy disk, hard disk, CD-ROM, disk
array, etc.), and a network connection 76 to provide access
to the network 26. The server 24 may optionally be equipped
with one or more input devices 78 (e.g., keyboard, mouse,
track ball, touch panel screen, etc.) and a display 80.

The upgrade server runs an operating system 82 that is
stored in memory 74 and executed on the processing unit 72.
As an example, the operating system 82 may be the
Windows NT operating system from Microsoft Corporation, or
a Unix-based operating system.

The upgrade server 24 stores the old program version 32
and the new program version 34 in memory 74. The upgrade
server runs an upgrade program 36, which is stored in
memory 74 and executed on the processing unit 72, to create
an upgrade file 38 from images of the old and new versions
of the software. The upgrade program 36 has a substring
matching module 84 that finds common character substrings
in the two program versions to identify the matching
sections in the two images. The substring matching module 84
identifies the common substrings regardless of their
respective locations within the program versions. The upgrade
program 36 may also have a hashing algorithm 86 that is
capable of hashing one or both program versions and
constructing a hash table as a result. Use of the hashing
algorithm is described below under the section heading
"Modified Upgrade Process".

Upgrade Process

A method for upgrading software in the client remotely
from the upgrade server involves two phases: (1) creating, at
the upgrade server, an upgrade file from images of the old
and new versions of the software, and (2) reconstructing, at
the client, the new version of the software from the upgrade
file.

Three different implementations of the upgrade process
are described below. A general upgrade process is described
first. The general case utilizes a basic compression technique
that can be applied to any two data files that are expected to
contain similarities. Following this discussion is a
description of a modified upgrade process that involves use of a
hashing table to improve the speed of the basic compression
algorithm. The third case is more specifically directed to use
of a specific data structure of image files, referred to as NK
(new kernel) image files, and to improvements in the
upgrade process tailored to these files. These cases are
addressed below under separate headings.

Case 1: General Upgrade Process

The general upgrade process is described with reference
to FIGS. 4-6. The upgrade file creation phase is described
with respect to FIG. 4, while the reconstruction phase is
addressed in FIG. 6.

FIG. 4 shows steps in a method for constructing an
upgrade file to upgrade the old version of software to the
new version of software. The steps are performed by
computer-executable instructions contained in the upgrade
program 36 at the upgrade server 24.

At step 100, the upgrade program 36 compares images of
the old program version and the new program version. The
upgrade program treats the images as raw data or character
strings, and not as code. Accordingly, the old program
version is seen as one large character string, and the new
program version is seen as a different large character string.

At step 102, the substring matching module 84 finds all of
the substrings that the old and new character strings have in
common given different starting points in the two strings.
More particularly, the substring matching module 84 derives
the length of each longest common substring beginning at a
first position in the old character string and at a second
position in the new character string. The length can be any
value from 0 to many characters.

More succinctly, the substring matching module 84 finds,
for any two strings, s1 and s2, and two offsets into the
strings, p1 and p2, the longest common substring starting at
offset p1 in string s1 and offset p2 in string s2. The substring
matching module 84 runs a process that works backwards in
the two strings from their ends to their beginnings,
computing longest substrings in terms of earlier computed longest
substrings. The process is embodied in the following code:

    for i=1 to length(s1)
        substr[i][length(s2)+1]=0;

    for j=1 to length(s2)
        substr[length(s1)+1][j]=0;

    for i=length(s1) down to 1
        for j=length(s2) down to 1
            if s1[i]=s2[j]
                substr[i][j]=substr[i+1][j+1]+1;
            else
                substr[i][j]=0;

This process derives a two-dimensional table having
multiple entries substr[i][j]. Each entry, substr[i][j], is the
length of the longest common substring beginning at
position "i" in string s1 and position "j" in string s2.

To demonstrate this process, suppose string s1 is a short
character string "cling" and string s2 is a short character
string "glint". Notice that these two character strings share
the middle three letters "lin". The process begins at the last
characters in the strings "cling" and "glint" and works back
towards the beginning characters.

For the first iteration, "i" and "j" are set to five. The
term s1[i] references the "g" in "cling" and the term s2[j]
references the "t" in "glint". The characters do not match
and hence j is decremented to four, changing the term s2[j]
to reference the "n" in "glint". Again, there is no match.
The process continues for j=3 (i.e., s2[3]="i"), and then for
j=2 (i.e., s2[2]="l"). No match occurs until the position j is
decremented to one, at which the term s2[1] references the
"g" in "glint". At this point, the "g" in "glint" (i.e., string
s2) matches the "g" in "cling" (i.e., string s1). According to
the above code, the substring variable substr[5][1] is given a
value "substr[i+1][j+1]+1", which in this case is
substr[6][2]+1, or one. Hence, a value of one is inserted into
the table for the entry substr[5][1].

FIG. 5a shows a two-dimensional table 120 being indexed
by the characters in string s1 (i.e., "cling") and the string s2
(i.e., "glint"). FIG. 5a shows the table 120 with the last
column filled in. The value one at the intersection of the two
"g"s in the character strings indicates that there is a
substring one character in length that begins at position 5 in
the string s1 (i.e., the "g" in "cling") and at position 1 in
string s2 (i.e., the "g" in "glint").

The position counter i for string s1 is then decremented
to four, and the process cycles again through the position
counter j for string s2 from five to one. In this case, the "n"
characters in each string match at j=4 and i=4. Accordingly,
the entry substr[4][4] is given a value "substr[4+1][4+1]+1",
which in this case is substr[5][5]+1, or one.

When the position counter i is decremented to three, the
"i" characters in each string match at j=3 and i=3. In this
case, entry substr[3][3] is given a value "substr[3+1][3+1]+1",
or two. The value two indicates that there is a substring
two characters long that begins at position 3 in the string s1
(i.e., the "i" in "cling") and at position 3 in string s2 (i.e.,
the "i" in "glint").

FIG. 5b shows the two-dimensional table 120 with the
entries in the last three columns filled in.

The process continues for values i=2 and i=1. FIG. 5c
shows the two-dimensional table 120 with all of the entries
filled in. Notice that entry substr[2][2] has a value three,
indicating that a common substring of three characters in
length begins at position 2 in string s1 and position 2 in
string s2.

At step 104, the substring matching module ascertains the
longest common substring for a given position pos2 in the
string s2 (i.e., the new software version). This step is
performed as follows:

    maxRunLen[pos2] = max{substr[i][pos2] : 1 ≤ i ≤ length(s1)}

The byte count matched is maxRunLen[pos2] and the
pointer into string s1 is the i such that substr[i][pos2] is
maximized. Using the above example, the maximum
substring beginning at position 3 in the second string "glint" is
two. The maximum substring beginning at position 2 in the
second string "glint" is three.
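
The table construction of step 102 and the per-position maximum of step 104 can be sketched in Python as follows. This is an illustrative sketch only, not the upgrade program itself: the function names build_substr_table and max_run are hypothetical, and indices are 0-based here whereas the text counts positions from 1.

```python
def build_substr_table(s1, s2):
    """Return substr, where substr[i][j] is the length of the common run
    that starts at s1[i] and s2[j] (0-based; the text is 1-based)."""
    n, m = len(s1), len(s2)
    # The extra row and column of zeros stand in for the border entries
    # substr[i][length(s2)+1] = 0 and substr[length(s1)+1][j] = 0.
    substr = [[0] * (m + 1) for _ in range(n + 1)]
    # Work backwards from the ends of both strings, as in the text.
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if s1[i] == s2[j]:
                substr[i][j] = substr[i + 1][j + 1] + 1
    return substr


def max_run(substr, pos2):
    """Return (length, offset into s1) of the longest run starting at s2[pos2]."""
    n = len(substr) - 1  # number of real rows, excluding the zero border
    best_i = max(range(n), key=lambda i: substr[i][pos2])
    return substr[best_i][pos2], best_i


table = build_substr_table("cling", "glint")
print(max_run(table, 1))  # prints (3, 1): "lin" starts at 0-based offset 1 in "cling"
```

With 0-based indices, table[1][1], table[2][2], table[3][3], and table[4][0] hold 3, 2, 1, and 1, which correspond to the entries substr[2][2], substr[3][3], substr[4][4], and substr[5][1] in the example above.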

After the upgrade program has found common substrings, 
it can distinguish between matching sections of the program 
versions (i.e., substrings that are present in both versions of 
the software) and non-matching sections (i.e., substrings in 
the new version, but not in the old version). The upgrade 
program can begin building the upgrade file and demarcat- 
ing the two different types of sections. That is, the upgrade 
program places a token or header at the beginning of each 
section to designate the type of section. 

At step 106, for matching sections, the upgrade program 
36 inserts corresponding pointer headers in the upgrade file 
in lieu of the common substrings. The pointer headers 
reference corresponding locations in the old version of the 
program at which the common substrings reside. These 
pointer headers convey essentially the same information 
about the new version as if the entire common substrings 
were reproduced in the upgrade file and hence, the common 
substrings are omitted from the upgrade file. However, since 
the pointer headers are smaller in size than the common 
substrings they represent (and often times, substantially 
smaller), the substitution of pointer headers for long com- 
mon substrings helps compress the update file to a size 
smaller than the new version. 

At step 108, for non-matching sections, the upgrade
program inserts corresponding data headers into the upgrade
file. The data headers indicate that the accompanying data
are new sections and are not found in the old program
version. The upgrade program also copies the new
substrings into the upgrade file in association with their data
headers (step 110).

The pointer and data headers contain four fields. The first 
field is a one-bit flag that identifies the header as either a 
pointer header or a data header. The second field is a two-bit 
count of the additional bytes necessary to represent the 
amount of data in the section. The third field contains a data 
length indicating the number of bytes in the corresponding 
section. The third field ranges from five to twenty-nine bits. 
The fourth field contains an offset value indicating the 
number of bytes into the old version to locate the start of a 
common substring. The fourth field is used only for the 
pointer header (24-bits), and is null in the data header. Table 
1 summarizes the header types. 



TABLE 1

Token type   Type flag   Byte count   Data length    File offset
Pointer      1 bit       2 bits       5 to 29 bits   24 bits
Data         1 bit       2 bits       5 to 29 bits   none

The byte count and data length fields allow efficient 
representation of both short and long runs. Most runs are 
short, so many of the bits in the data length field are not 
needed. For long runs, however, a large bit value can be 
stored in the length field. With the two-bit byte count, the 
length field can occupy five, 13, 21, or 29 bits, as necessary. 

Accordingly, the header occupies one to four bytes for
data sections and four to seven bytes for pointer sections.
The compressed upgrade file ends up comprising many
sections, each of which is either an old section (which is
replaced with a four-byte to seven-byte pointer header into
the old version) or a new section of raw data (which is
demarcated by a one-byte to four-byte data header).
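
The field widths in Table 1 can be checked with a small Python sketch of one possible encoding. The text gives only the field sizes, so several details here are assumptions made for illustration: the flag value 1 denotes a pointer header, fields are packed most-significant-bit first in the order flag, byte count, data length, and the 24-bit offset follows as three big-endian bytes. The names encode_header and decode_header are hypothetical.

```python
POINTER, DATA = 1, 0  # assumed flag values; the text does not assign them


def encode_header(kind, length, offset=None):
    """Pack a header: 1-bit flag, 2-bit byte count, 5/13/21/29-bit length,
    plus a 24-bit offset for pointer headers."""
    extra = 0  # additional length bytes beyond the 5 bits in the first byte
    while length >= 1 << (5 + 8 * extra):
        extra += 1
    if extra > 3:
        raise ValueError("length does not fit in 29 bits")
    length_bits = 5 + 8 * extra
    word = (kind << (2 + length_bits)) | (extra << length_bits) | length
    out = word.to_bytes(1 + extra, "big")
    if kind == POINTER:
        if offset is None or not 0 <= offset < 1 << 24:
            raise ValueError("pointer headers need a 24-bit offset")
        out += offset.to_bytes(3, "big")
    return out


def decode_header(buf, pos=0):
    """Return (kind, length, offset, position of the byte after the header)."""
    first = buf[pos]
    kind = first >> 7
    extra = (first >> 5) & 0b11
    length = first & 0b11111
    for b in buf[pos + 1 : pos + 1 + extra]:
        length = (length << 8) | b
    pos += 1 + extra
    offset = None
    if kind == POINTER:
        offset = int.from_bytes(buf[pos : pos + 3], "big")
        pos += 3
    return kind, length, offset, pos
```

Under these assumptions a data header for a run of up to 31 bytes occupies a single byte, and a pointer header never exceeds seven bytes, matching the one-to-four and four-to-seven byte ranges stated above.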

Since the pointer header ranges from four to seven bytes,
one optimization to the process described above is that it
only finds common substrings that are greater than this
pointer header length, such as eight bytes. These longer
substrings are then replaced with pointer headers in the
compressed update file. Common substrings that are less
than eight bytes may be copied directly into the upgrade file
in less space than would be consumed by a corresponding
pointer header. This optimization precludes inclusion of
pointer headers for short substrings on the order of only a
few characters (e.g., eight bytes or less).
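
The effect of the length threshold can be illustrated with a short Python sketch that splits a new version into pointer and data sections. It rests on assumptions not stated in the text: sections are chosen greedily from left to right, a naive search stands in for the table lookup, runs of exactly min_match bytes are treated as worth a pointer, and sections are represented as tuples rather than as encoded headers. The names longest_match and build_sections are hypothetical.

```python
def longest_match(old, new, pos):
    """Naive search for the longest run of old that matches new at new[pos]."""
    best_len, best_off = 0, 0
    for off in range(len(old)):
        k = 0
        while (off + k < len(old) and pos + k < len(new)
               and old[off + k] == new[pos + k]):
            k += 1
        if k > best_len:
            best_len, best_off = k, off
    return best_len, best_off


def build_sections(old, new, min_match=8):
    """Split new into ("pointer", offset, length) and ("data", bytes) sections."""
    sections, literal, pos = [], bytearray(), 0
    while pos < len(new):
        length, off = longest_match(old, new, pos)
        if length >= min_match:
            if literal:  # flush pending new bytes as one data section
                sections.append(("data", bytes(literal)))
                literal = bytearray()
            sections.append(("pointer", off, length))
            pos += length
        else:  # too short to be worth a pointer header; copy the byte itself
            literal.append(new[pos])
            pos += 1
    if literal:
        sections.append(("data", bytes(literal)))
    return sections
```

For example, with old = b"HEADER--shared-block-of-code--FOOTER" and new = b"NEW!shared-block-of-codetail", the 20-byte shared run becomes a single pointer section, while the short surrounding runs are emitted as data.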

At step 112, the upgrade server 24 downloads the upgrade 
file 38 to the client 22 over the network 26. The client 22 in 
turn uses the upgrade file 38 to upgrade the old version of the 
program to the new version. 

FIG. 6 shows steps in a method for reconstructing the new
software version from the upgrade file and old version. The
steps are performed by computer-executable instructions
stored in memory at the client. Alternatively, the steps might
be performed by specific hardware components at the client
that contain hardwired logic for performing the steps, or by
any combination of programmed computer components and
custom hardware components.

At step 130, the client receives the upgrade file 38 from
the upgrade server 24 and stores the file in RAM 54. The
client 22 processes the upgrade file 38 section by section,
according to the headers it encounters (step 132). The
client's operating system 30 is configured to perform the
upgrade procedures to reconstruct the new version of the
program from the old version and the upgrade file. The
reconstructed new version is stored in the flash memory 58.

For any section in the upgrade file that is demarcated by
a pointer header, the client copies the longest common
substring referenced by the pointer header from the old
version of the program stored in flash memory 58 into the
new version being reconstructed (step 134). The file offset
value in the pointer header locates the start of the common
substring and the byte count informs the client of the length
of the common substring. The client allocates sufficient
space in the flash memory to hold a section as large as the
byte count indicates, and then copies in the section.

For any section in the upgrade file that is demarcated by
a data header, the client adds the new section included in the
upgrade file into the new version being reconstructed (step
136). The client uses the data header's byte count to
determine the size of the ensuing new section. The client allocates
sufficient space to accommodate the new section and writes
the new section into the reconstructed new version.

The client continues through the upgrade file, header by
header and section by section. When the client has finished
processing the upgrade file, the client informs the user of a
new version of software and prompts the user as to whether
he/she would like to upgrade to the new version. If so, the
client reboots using the new version stored in flash memory.

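The client-side loop of steps 132 to 136 can be sketched in Python as shown here. The sketch assumes the upgrade file has already been parsed into a list of sections, with ("pointer", offset, length) standing for a pointer header and ("data", bytes) standing for a data header plus its payload; this tuple representation and the name reconstruct are hypothetical and are used only to keep the example short.

```python
def reconstruct(old, sections):
    """Rebuild the new version from the old version and parsed sections."""
    new = bytearray()
    for section in sections:
        if section[0] == "pointer":
            # Step 134: copy the matching run out of the locally stored old version.
            _, offset, length = section
            if offset + length > len(old):
                raise ValueError("pointer section runs past the old version")
            new += old[offset : offset + length]
        elif section[0] == "data":
            # Step 136: append the new bytes carried in the upgrade file.
            new += section[1]
        else:
            raise ValueError(f"unknown section type: {section[0]!r}")
    return bytes(new)


old_version = b"HEADER--shared-block-of-code--FOOTER"
upgrade = [("data", b"NEW!"), ("pointer", 8, 20), ("data", b"tail")]
print(reconstruct(old_version, upgrade))  # prints b'NEW!shared-block-of-codetail'
```

Only the 8 bytes in the two data sections travel with the upgrade file in this example; the 20-byte shared run is read from the old version.
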
Case 2: Modified Upgrade Process with Hash Table 

The modified upgrade process improves the speed of the
basic process for creating an upgrade file by using a hash
table. A hash table is a data structure that allows efficient
lookup of values in a large data set. A hash function maps the
key values from a large range to a smaller range, which can
be chosen arbitrarily.

FIG. 7 shows steps in another method for constructing an
upgrade file, using the modified process. The
steps are performed by computer-executable instructions
contained in the upgrade program 36 at the upgrade server
24.

A preliminary step 150 in the modified upgrade process
involves hashing every possible group of k contiguous bytes
in the old file version into a large hash table. The upgrade
program 36 employs the hashing algorithm 86 to hash bytes
0 to k, then bytes 1 to k+1, and so on. After the hash table
is constructed, the upgrade program 36 evaluates the new
software version against the old software version. The
upgrade program 36 employs the hashing algorithm 86 to
hash every possible group of k contiguous bytes of the new
software version (step 152). The upgrade program 36
determines whether the k-byte run of the new version hashes to
a value in the hashing table (step 154). If not, the upgrade
program 36 proceeds to the next k-byte run in the new
version (step 156); otherwise, the upgrade program 36
compares the old and new versions that begin with the
common k-byte run (step 158).

In this manner, only k-byte runs of the new version that
hash to the same value as a k-byte run of the old version are 
compared. With a carefully chosen hash function, table size, 
and value of k, the number of such words can be kept small. 
As one example, the value k can be set to the threshold 
number of bytes for replacement with a reference header. 
For instance, if the process is optimized to find common 
substrings that are greater than eight bytes, the value k 
should be set to eight bytes to minimize the number of runs 
that hash to the same value. 

At step 160, the upgrade program 36 ascertains the 
longest common substring beginning at a position in the new 
version and an offset into the old version where the corre- 
sponding matching run begins. Another efficiency improve- 
ment stems from an observation that in the basic compres- 
sion process, every row of the table depends only upon the 
previous row. Thus, if the maximal runs are calculated on the
fly instead of in a separate pass, only the current row and 
previous row need to be in memory, rather than the entire 
table. The hashing module 86 implements the following 
code. 

for i=1 to length(s1)
    insert <Hash(s1[i]), i> into hash table
for j=length(s2) down to 1
    for each <c, i> in hash table such that c=Hash(s2[j])
        if (s1[i]=s2[j])
            if (i=length(s1)) or (j=length(s2)) or (s1[i+1]≠s2[j+1])
                curRow[i]=1;
            else
                curRow[i]=prevRow[i+1]+1;
            if (curRow[i]>maxRunLen[j])
                maxRunLen[j]=curRow[i];
                maxRunOffset[j]=i;
    Swap(prevRow, curRow);

When this code terminates, maxRunLen[j] contains the
length of the maximal run beginning at offset j in File2, and
maxRunOffset[j] contains the offset of the matching run in
File1.
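
The two-row bookkeeping described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation in the hashing module 86: the function name longest_runs, the default k of 4, the zero-based offsets, and the use of a Python dict keyed on k-byte slices (standing in for the hash table and hash function) are all chosen here for illustration.

```python
from collections import defaultdict


def longest_runs(old: bytes, new: bytes, k: int = 4):
    """For each offset j in `new`, return the length and `old` offset of the
    longest run of at least k bytes that also occurs in `old`."""
    # Index every k-byte group of the old version. The dict's own hashing
    # stands in for the hash table (an illustrative assumption).
    table = defaultdict(list)
    for i in range(len(old) - k + 1):
        table[old[i:i + k]].append(i)

    max_run_len = [0] * len(new)
    max_run_offset = [-1] * len(new)
    prev_row = {}  # run lengths for position j + 1, keyed by old offset
    for j in range(len(new) - 1, -1, -1):
        cur_row = {}
        for i in table.get(new[j:j + k], ()):
            # If the run starting one byte later is known, extend it by one;
            # otherwise this k-byte group is the whole run.
            cur_row[i] = prev_row.get(i + 1, k - 1) + 1
            if cur_row[i] > max_run_len[j]:
                max_run_len[j] = cur_row[i]
                max_run_offset[j] = i
        prev_row = cur_row  # only two rows are ever held in memory
    return max_run_len, max_run_offset


lens, offs = longest_runs(b"Mary had a little lamb", b"John had a little lamb")
print(lens[4], offs[4])  # 18 4
```

Only positions whose k-byte group appears in the old version are ever compared, which is what keeps the work close to the sum of the file lengths when few groups collide.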

For a well-performing hash table (i.e., one with a properly
selected hash function, table size, and value of k), the code 
executes in time proportional to the sum of the lengths of the 
two files. This is a significant improvement over the original 
code described above in Case 1, which runs in time propor- 
tional to the product of the two lengths. Thus, while the 
original algorithm would have taken days to run on files of 
a few megabytes, the improved algorithm handles such files 
in seconds. 
Steps 162-168 are similar to steps 106-112 of FIG. 4.
Case 3: Upgrade Process for NK Image File 

The upgrade processes defined in cases 1 and 2 are well
suited for two files (i.e., the old and new versions) that share
a lot of data in common. Another aspect of this invention,
however, concerns use of image files that are at least partly
compressed. One exemplary file type, known as NK (new
kernel) image files, is designed for sending large amounts of
data from a server to a client and then unpacking the data to
the correct locations on the client. NK image files have a
specific format, beginning with a fifteen-byte header,
defined by the following structure:

struct _ROMIMAGE_HEADER {
    UCHAR Signature[7];
    ULONG PhysicalStartAddress;
    ULONG PhysicalSize;
};

The data sections follow this header. Each data section has
its own header, defined by the following structure:

struct _ROMIMAGE_SECTION {
    ULONG Address;
    union {
        ULONG Size;
        ULONG EntryPoint;
    };
    ULONG Checksum;
};

The "Size" field indicates the size of the section in bytes
and the "Address" field indicates the destination location for
those bytes on the client. After "Size" bytes, there is another
ROMIMAGE_SECTION structure defining the next data
section. The NK image file can contain an arbitrary number
of data sections. At the conclusion of the data sections is a
final ROMIMAGE_SECTION structure with an "Address"
field of zero to indicate the end of the file.
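
The container layout just described can be walked with a short Python sketch. It is illustrative only: the function name parse_nk_image is hypothetical, little-endian byte order and unpadded fields are assumptions, and treating the union in the terminating record as the EntryPoint is an inference from the structure definition rather than something the text states.

```python
import struct

# Assumed layouts: little-endian, no padding between fields.
HEADER = struct.Struct("<7sII")  # Signature[7], PhysicalStartAddress, PhysicalSize
SECTION = struct.Struct("<III")  # Address, Size (or EntryPoint), Checksum


def parse_nk_image(blob: bytes):
    """Split an NK-style image into its header fields and data sections."""
    signature, start, physical_size = HEADER.unpack_from(blob, 0)
    pos = HEADER.size  # the fifteen-byte header
    sections = []
    while True:
        address, size_or_entry, checksum = SECTION.unpack_from(blob, pos)
        pos += SECTION.size
        if address == 0:
            # Terminating record; the union is assumed to hold EntryPoint here.
            return signature, start, physical_size, sections, size_or_entry
        sections.append((address, blob[pos:pos + size_or_entry], checksum))
        pos += size_or_entry  # the next section header follows "Size" bytes
```

Each returned section is an (Address, data, Checksum) tuple, so later stages can process the sections one at a time.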

Although the NK image file structure is rather simple, the
various sections of the NK images cannot be directly used in
the modified compression algorithm described above under
the "Case 2" heading because each of these sections, or even
subparts of each section, may already be compressed using
an LZ compression algorithm. Because one of the byprod-
ucts of a good data compression algorithm is apparent
randomness in the resulting data, two very similar files may
in fact lose all similarity once they have been compressed.

As a concrete example, suppose the old image file con-
tains the sentence "Mary had a little lamb", and the new
image file contains "John had a little lamb". When creating
an upgrade file, the upgrade program 36 replaces the new
sentence with a data run of four bytes for "John" and a
reference run for the remaining 18 bytes, which are identical
in both image files. Because the data header occupies one
byte, and the reference header occupies four bytes, the
compressed version is only nine bytes long (i.e., one-byte
data header, four-byte data run for John, and four-byte
reference header). This is less than one-half as large as the
original 22-byte sentence.
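
The byte count in this example can be checked with a few lines of Python. The sketch assumes the header sizes stated in the example (one byte for a data header, four bytes for a reference header); the function name upgrade_size and the tuple representation of runs are hypothetical.

```python
DATA_HEADER = 1       # bytes, as stated in the example
REFERENCE_HEADER = 4  # bytes, as stated in the example


def upgrade_size(runs):
    """Sum the encoded size of a list of ("data", n) and ("ref", n) runs."""
    total = 0
    for kind, length in runs:
        if kind == "data":
            total += DATA_HEADER + length  # header plus the literal bytes
        else:
            total += REFERENCE_HEADER      # header only; bytes come from the old file
    return total


runs = [("data", 4), ("ref", 18)]  # "John" followed by " had a little lamb"
print(upgrade_size(runs))          # 9
```

A reference run costs the same four bytes however long the match is, which is why short matches are not worth replacing.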

Now suppose that the two sentences are LZ compressed.
In general, there will be no similarity between the resulting
sentences, despite a high degree of similarity in the origi-
nals. LZ compression may reduce the size of the sentence
from 18 bytes to 12 bytes, for example, but the algorithm
used to find common substrings will not be able to reduce
that size any further (and may actually increase it by a byte
if we account for the data header).

To resolve this conflict between the LZ compression
algorithm and the process used in creation of a compressed 
upgrade file, the upgrade program parses through the old and 
new NK image files and decompresses each compressed 
section. Afterwards, each section can be processed using the 
methods described in FIG. 4 or FIG. 7. 

FIGS. 8a and 8b show steps in a method for constructing 
an upgrade file to upgrade an old NK image file to a new NK 
image file. At step 180, the upgrade program first locates the 
compressed sections in the two image files. Fortunately, 
each NK image file contains a table of contents that leads to
this information. The table of contents has the following 
structure: 



struct ROMHDR {
    ULONG dllfirst;
    ULONG dlllast;
    ULONG physfirst;
    ULONG physlast;
    ULONG nummods;
    ULONG ulRAMStart;
    ULONG ulRAMFree;
    ULONG ulRAMEnd;
    ULONG ulCopyEntries;
    ULONG ulCopyOffset;
    ULONG ulProfileLen;
    ULONG ulProfileOffset;
    ULONG numfiles;
    ULONG ulKernelFlags;
    ULONG ulFSRamPercent;
    ULONG ulDrivglobStart;
    ULONG ulDrivglobLen;
    ULONG ulIntrStackStart;
    ULONG ulIntrStackLen;
    ULONG ulTrackingStart;
    ULONG ulTrackingLen;
};

The key entries are "nummods", which is the number of
modules in the image, and "numfiles", which is the number
of files. Immediately following the table of contents are
TOCentry module entries, which have the following form:

struct TOCentry {
    DWORD dwFileAttributes;
    FILETIME ftTime;
    DWORD nFileSize;
    LPSTR lpszFileName;
    ULONG ulE32Offset;
    ULONG ulO32Offset;
    ULONG ulLoadOffset;
};

Entry "ulE32Offset" is a pointer to an E32 structure, and
entry "ulO32Offset" is a pointer to the first O32 structure for
the module. The E32 structure has the following format:

struct e32_rom {
    unsigned short e32_objcnt;
    unsigned short e32_imageflags;
    unsigned long e32_entryrva;
    unsigned long e32_vbase;
    unsigned short e32_subsysmajor;
    unsigned short e32_subsysminor;
    unsigned long e32_stackmax;
    unsigned long e32_vsize;
    unsigned short e32_subsys;
    struct info e32_unit[ROM_EXTRA];
};

The "e32_objcnt" entry contains the number of O32
structures for the module. The O32 structure has the fol-
lowing format:

struct o32_rom {
    unsigned long o32_vsize;
    unsigned long o32_rva;
    unsigned long o32_psize;
    unsigned long o32_dataptr;
    unsigned long o32_realaddr;
    unsigned long o32_flags;
};

Within this structure, the upgrade program can check
whether the section is compressed by looking at "o32_flags".
If (o32_flags & 0x00002000) is nonzero, the section is
compressed. Entry "o32_psize" is the compressed size and
entry "o32_vsize" is the uncompressed size. Entry
"o32_dataptr" is a pointer to the section's data.

After the TOCentry structures for the modules are
FILESentry structures for the files. These structures have the
following form:

struct FILESentry {
    DWORD dwFileAttributes;
    FILETIME ftTime;
    DWORD nRealFileSize;
    DWORD nCompFileSize;
    LPSTR lpszFileName;
    ULONG ulLoadOffset;
};



If (dwFileAttributes & 0x00000800) is nonzero, the file is com-
pressed. Entry "nCompFileSize" is the compressed file size
and entry "nRealFileSize" is the real file size. Entry
"ulLoadOffset" is a pointer to the file's data. Thus, by 
reading through the structures described above, the upgrade 
program can determine which sections in the old and new 
NK images are compressed. 
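
The two flag tests can be sketched as follows. Because a bitwise AND with a single-bit mask yields either zero or the mask itself, the sketch compares the result against zero. The helper names are hypothetical, and mapping o32_dataptr, o32_psize and o32_vsize onto a (address, compressed size, uncompressed size) tuple is an assumption made for illustration.

```python
O32_COMPRESSED = 0x00002000   # mask tested against o32_flags
FILE_COMPRESSED = 0x00000800  # mask tested against dwFileAttributes


def section_is_compressed(o32_flags: int) -> bool:
    """True when the compression bit is set in a section's o32_flags."""
    return (o32_flags & O32_COMPRESSED) != 0


def file_is_compressed(dw_file_attributes: int) -> bool:
    """True when the compression bit is set in a file's dwFileAttributes."""
    return (dw_file_attributes & FILE_COMPRESSED) != 0


def compressed_regions(sections):
    """Collect (address, compressed size, uncompressed size) for each
    compressed section, given (o32_dataptr, o32_psize, o32_vsize, o32_flags)."""
    return [(ptr, psize, vsize)
            for ptr, psize, vsize, flags in sections
            if section_is_compressed(flags)]
```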

At step 182 in FIG. 8, the upgrade program 36 reads 
through the old NK image file and constructs a table of the 
compressed regions in the file. Each table entry has the 
following form: 

struct _COMPR_RGN 
{ 

UINT32 iAddress; 
UINT32 cBytesCompressed; 
UINT32 cBytesUncompressed; 

}; 

Next, at step 184, the upgrade program 36 reads through 
the entire old image and creates a decompressed version of 
the old image file. During this read-through, the upgrade 
program 36 performs a number of tasks, including removal 
of the ROMIMAGE_HEADER and ROMIMAGE_ 
SECTION structures (step 186), decompression of each 
compressed region (step 188), and insertion of spacer char- 
acters between the various regions (step 190). The spacers 
help avoid creation of reference runs that cross region 
boundaries. The upgrade program also creates a translation 
table (step 192), in which each entry has the following form: 

struct _TranslationEntry
{
    UINT32 iPacked;
    ADDRESS iUnpacked;
};

Entry "iPacked" is the offset into the version of the old
image file that is created, with the headers removed and all 
sections decompressed. Entry "iUnpacked" is the corre- 
sponding client destination address. The ADDRESS struc- 
ture is defined as follows: 

struct _ADDRESS 
{ 

UINT32 iAddr; 
UINT32 iOffset; 

}; 

For an uncompressed region, entry "iOffset" in the
ADDRESS structure is 0xffffffff, and entry "iAddr" is the
actual client address. For a compressed region, however, 
entry "iAddr" is an index into the table of compressed 
regions formed in step 182 and entry "iOffset" is the offset 
into the decompressed version of that region. 

As an example, suppose that the data from 0x9f420000 to 
0x9f421000 is compressed, with decompressed size 2000, 
and the data from 0x9f421000 to 0x9f422000 is uncom- 
pressed. Byte 1500 in the decompressed region is referenced 
with "iAddr" zero, indicating the first compressed region in 
the table, and "iOffset" 1500. Address 0x9f422800 is 
referred to as "iAddr" 0x9f422800 and "iOffset" 0xffffffff.
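
The two interpretations of the ADDRESS structure can be illustrated with a small Python sketch. The class name Address, the function describe, and the reading of the decompressed size 2000 as a decimal value are assumptions made for this example.

```python
from typing import NamedTuple

UNCOMPRESSED = 0xFFFFFFFF  # iOffset marker for data that was never compressed


class Address(NamedTuple):
    i_addr: int    # client address, or index into the compressed-region table
    i_offset: int  # offset into the decompressed region, or UNCOMPRESSED


def describe(addr: Address, regions) -> str:
    """regions: list of (iAddress, cBytesCompressed, cBytesUncompressed)."""
    if addr.i_offset == UNCOMPRESSED:
        return f"client address {addr.i_addr:#x}"
    start, _, unpacked_len = regions[addr.i_addr]
    if addr.i_offset >= unpacked_len:
        raise ValueError("offset lies outside the decompressed region")
    return f"byte {addr.i_offset} of region {addr.i_addr} (stored at {start:#x})"


regions = [(0x9F420000, 0x1000, 2000)]
print(describe(Address(0, 1500), regions))
print(describe(Address(0x9F422800, UNCOMPRESSED), regions))
```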

At step 194 in FIG. 8, the upgrade program 36 reads 
through the new NK image file and constructs a table of the
compressed regions in the file. Then, at step 196, the upgrade
program 36 reads through the entire new image file a second
time and creates a decompressed version of the new image.
The upgrade program decompresses every compressed part
of the entire image file (step 198), while leaving enough
information to recompress the file back to its original state.
This recompression information is in the form of compres- 
sion commands that describe which sections should be 
compressed to yield the original image. 

At step 200, the upgrade program writes the compressed 
region table of the old image at the beginning of the 
decompressed version of the new image file. In addition, the 
upgrade program writes the number of bytes in the com- 
pressed and uncompressed versions of the sections (step 
202). As a result, the decompressed version of the new 
image is the same as the original version, with the following 
exceptions: 

1. No compressed data remain. 

2. The compressed region table for the old image is 
written at the beginning of the decompressed version of the 
new image file. 

3. After each ROMIMAGE_SECTION structure, the 
uncompressed size of the section, which is equal to or 
greater than the Size field of the ROMIMAGE_SECTION
structure, is written. 

4. If the uncompressed size is greater than the size in the 
ROMIMAGE_SECTION structure, some part of the sec- 
tion must have been compressed. Thus, the process writes a 
number, the count of compression commands, and then a 
series of compression commands. Each of these has the 
following structure: 

struct _COMPR_CMD
{
    UINT32 cBytesCompressed;
    UINT32 cBytesUncompressed;
};

The upgrade program 36 writes the number of bytes in the
compressed version of the section and the number of bytes
in the uncompressed version into the "COMPR_CMD"
structure. If the two byte counts are equal, the section is not
compressed. If they differ, the section will require LZ
compression on the client side.

Now, it should be clear that given the correct LZ com- 
pression program, the original image file can be recreated 
from the new file. 

At this point, the upgrade program 36 processes the
decompressed versions of the old image file and the new
image file to create an upgrade file (step 204 in FIG. 8b). The
upgrade program uses the modified upgrade process
described above in "Case 2" and a hash table containing the
decompressed version of the old image file. Essentially all
data is compressed using the data and reference runs, as
described above, with one exception. The program writes
the ROMIMAGE_HEADER, ROMIMAGE_SECTION,
and COMPR_CMD data directly into the upgrade file
without trying to compress them, as these data almost
certainly will not be found in the old image (step 206).

One modification is made to the pointer token type to
distinguish between copies from compressed and uncom-
pressed regions. Table 2 shows the fields in the new tokens
as follows:

TABLE 2

Token type                Type    Compr   Byte count   Data length    Region         Offset
Pointer to compressed     1 bit   1 bit   2 bits       4 to 28 bits   8 to 24 bits   24 bits
Pointer to uncompressed   1 bit   1 bit   2 bits       4 to 28 bits   none           24 bits
Data                      1 bit   none    2 bits       5 to 29 bits   none           none

The new fields are "Compr," which is a flag indicating
whether the region is compressed, and "Region," which is a
pointer into the compressed region table. If the number of
compressed regions is less than 256, one byte is used; if the
number of compressed regions is less than 65536, two bytes
are used; otherwise, three bytes are used.
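
The width rule for the "Region" field can be written as a small Python function. The names region_field_bytes and encode_region are hypothetical, and the little-endian byte order used when packing the index is an assumption.

```python
def region_field_bytes(num_compressed_regions: int) -> int:
    """Number of bytes used for the "Region" field of a pointer token."""
    if num_compressed_regions < 256:
        return 1
    if num_compressed_regions < 65536:
        return 2
    return 3


def encode_region(region_index: int, num_compressed_regions: int) -> bytes:
    """Pack a region index into the field width chosen above (assumed little-endian)."""
    return region_index.to_bytes(region_field_bytes(num_compressed_regions), "little")
```

The three possible widths correspond to the 8 to 24 bits listed for the Region field in Table 2.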

Because the tokens can potentially be longer, the param- 
eters for determining when to replace runs can be adjusted. 
For instance, the program may only replace runs of at least 
ten bytes with pointers. This is an adjustable heuristic, which 
has no effect on decompression.

The structure of the compressed version of the new image 
file is as follows: 

1. One byte indicating whether the new image is com-
pressed. If this byte is zero, the image is uncompressed. If
the byte is non-zero, the image is compressed.

2. The table of the compressed regions in the old image.
Three bytes indicate the size of the table, followed by a series
of COMPR_RGN structures from the table itself.

3. The ROMIMAGE_HEADER.

4. The following components are repeated for each sec-
tion:

a. A ROMIMAGE_SECTION structure.

b. The uncompressed size of the section.

c. If the uncompressed size is not 0xffffffff, the data are
compressed by the process. Otherwise, the actual sec-
tion data remain. Some sections may grow as a result
of the algorithm because the sections are first LZ
decompressed before running our process. For some
sections, the LZ compression may be better than our
compression, so we leave those sections LZ com-
pressed.

d. If the uncompressed size is greater than the Size in the
ROMIMAGE_SECTION structure, the number of
compression commands and the commands them-
selves.

5. Finally, a ROMIMAGE_SECTION structure with the
Address set to zero.

This takes care of the server-side phase. In the client-side
reconstruction phase, the client can optionally read the 
compressed file in its entirety and then decompress it, or 
decompress section by section. 

FIG. 9 shows steps in a method for reconstructing the new 
software version from the upgrade file at the client. The steps 
are performed by computer-executable instructions stored in
memory at the client. Alternatively, the steps might be 
performed by specific hardware components at the client 
that contain hardwired logic for performing the steps, or by 
any combination of programmed computer components and 
custom hardware components. 

At step 220, the client reads the first byte to determine 
whether the file is compressed. If it is zero (i.e., the "yes" 
branch from step 222), the image is uncompressed and the 
client simply reads the rest of the file and treats it as a normal 
NK image (step 224). If the first byte is non-zero (i.e., the 
"no" branch from step 222), the image is compressed and the 
client continues through the following steps. 

At step 226, the client reads the table of compressed
regions and stores it in memory. The table is used later to
handle the reference runs. The client reads the
ROMIMAGE_HEADER (step 228). Then, at step 230, the
client reads in the ROMIMAGE_SECTION structures one-
by-one until reaching a last structure that has an Address
field of zero. For each section, the client reads the uncom-
pressed size (step 232). If the uncompressed size is 0xffffffff
(i.e., the "yes" branch from step 234), the client reads in the
entire section as data (step 236). Otherwise (i.e., the "no"
branch from step 234), the client reads tokens one-by-one
and copies the data from the compressed file for data tokens
or from the old image file for copy tokens (step 238). The
process stops when the uncompressed data size is reached.
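
The token loop of step 238 can be sketched in Python under a simplified token model. The tuples ("data", bytes) and ("copy", offset, length) are a hypothetical stand-in for the bit-packed tokens of Table 2, and the old image is assumed to be already decompressed and held in memory.

```python
def rebuild_section(tokens, old_image: bytes, uncompressed_size: int) -> bytes:
    """Rebuild one section from data tokens and copy tokens."""
    out = bytearray()
    for token in tokens:
        if len(out) >= uncompressed_size:
            break  # stop once the uncompressed data size is reached
        if token[0] == "data":
            out += token[1]                           # bytes carried in the upgrade file
        else:
            _, offset, length = token
            out += old_image[offset:offset + length]  # bytes taken from the old image
    if len(out) != uncompressed_size:
        raise ValueError("tokens do not produce the expected section size")
    return bytes(out)


old = b"Mary had a little lamb"
print(rebuild_section([("data", b"John"), ("copy", 4, 18)], old, 22))
# b'John had a little lamb'
```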

The client evaluates the uncompressed image after read-
ing through all ROMIMAGE_SECTION structures (i.e.,
the "yes" branch from step 240). At step 242, the client
determines if the uncompressed size is larger than the
compressed size. If so, the client reads in the compression
commands and LZ compresses the new image to re-create
the correct image file (step 244).

The above procedure can be run as described assuming
the client has sufficient memory to hold the entire uncom-
pressed old image file and the largest uncompressed section
of the new image file. With less memory, the client can LZ
decompress regions of the old image file on demand when
they are needed. As a bare minimum, the client requires
enough memory to hold the largest uncompressed section of
the new image file and the largest uncompressed section of
the old image file. As more memory is available, more of the
old image file can be stored in its uncompressed state, and
the LZ decompression algorithm needs to run fewer times.
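
The memory trade-off can be illustrated with a bounded cache of decompressed regions. This is a sketch only: zlib stands in for the unspecified LZ algorithm, and the class OldImage, its cache_size parameter, and the decompress_calls counter are hypothetical names introduced for the example.

```python
import zlib
from functools import lru_cache


class OldImage:
    """Serves bytes from compressed regions, decompressing on demand."""

    def __init__(self, compressed_regions, cache_size=2):
        self._regions = compressed_regions
        self.decompress_calls = 0
        # A larger cache keeps more regions uncompressed and so
        # decompresses less often.
        self._region = lru_cache(maxsize=cache_size)(self._decompress)

    def _decompress(self, index):
        self.decompress_calls += 1
        return zlib.decompress(self._regions[index])  # zlib is a stand-in for LZ

    def read(self, index, offset, length):
        """Return `length` bytes at `offset` in decompressed region `index`."""
        return self._region(index)[offset:offset + length]
```

With cache_size set to 1, the client holds a single decompressed old region at a time, which corresponds to the bare-minimum case described above.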

Although the invention has been described in language 
specific to structural features and/or methodological steps, it 
is to be understood that the invention defined in the
appended claims is not necessarily limited to the specific
features or steps described. Rather, the specific features and 
steps are disclosed as preferred forms of implementing the 
claimed invention. 
What is claimed is:

1. A method for constructing an upgrade file to upgrade 
from an old version of software to a new version of software, 
comprising the following steps: 

distinguishing between matching sections that match in 
both the old version and the new version from non-
matching sections in the new version that have no
match in the old version;

for matching sections, inserting in the upgrade file a first
token identifying a corresponding section in the old
version that matches a section in the new version; and

for non-matching sections, inserting in the upgrade file a
second token and the non-matching section from the
new version.

2. A method as recited in claim 1, wherein the first token 
comprises a header with at least one field indicating a
number of bytes contained in the corresponding section in
the old version. 

3. A method as recited in claim 1, wherein the first token 
comprises a header with at least one first field indicating a
number of bytes contained in the corresponding section in
the old version and at least one second field holding an offset 
value into the old version to locate the corresponding 
section. 

4. A method as recited in claim 1, wherein the second 
token comprises a header with at least one field indicating a
number of bytes contained in the non-matching section in
the new version. 

5. A method as recited in claim 1, wherein: 

the first token comprises a byte count indicating a number
of bytes needed to represent an amount of data in the
corresponding section, a data length indicating a num-
ber of bytes in the corresponding section, and an offset
value into the old version to locate the corresponding
section; and

the second token comprises a header with the byte count
and the data length.

6. A method as recited in claim 1, further comprising the 
step of identifying the matching sections by performing the 
following steps: 

comparing an old character string from the old version 
with a new character string from the new version; 

finding longest common substrings beginning at first
positions in the old character string and second posi-
tions in the new character string; and

for a particular second position in the new character
string, ascertaining the longest common substring
beginning at the particular second position.

7. A method as recited in claim 6, wherein the finding step
comprises the step of constructing a two-dimensional table
having multiple entries cross-indexed by the old and new
character strings, individual entries representing a length of 
a longest common substring. 

8. A method as recited in claim 1, further comprising the
step of identifying the matching sections by performing the
following steps:

hashing every possible group of k contiguous bytes in the
old version to form first hash values;

hashing every possible group of k contiguous bytes in the
new version to form second hash values; and

in an event that one of the first hash values equals one of
the second hash values, comparing an old character
string from the old version that includes the k contigu-
ous bytes forming said one first hash value with a new
character string from the new version that includes the
k contiguous bytes forming said one second hash value.

9. A method as recited in claim 8, further comprising the
following steps: 

finding longest common substrings beginning at first 
positions in the old character string and second posi- 
tions in the new character string; and 

for a particular second position in the new character 
string, ascertaining the longest common substring 
beginning at the particular second position. 

10. A method as recited in claim 9, wherein the finding 
step comprises the step of constructing a two-dimensional 
table having multiple entries cross-indexed by the old and 
new character strings, individual entries representing a 
length of a longest common substring. 

11. A method as recited in claim 1, wherein the old version
and the new version are at least partly compressed, further 
comprising the step of identifying the matching sections by 
performing the following steps: 

decompressing the old version to form a decompressed 
old version; 

decompressing the new version to form a decompressed
new version; 

comparing an old character string from the decompressed 
old version with a new character string from the 
decompressed new version; 

finding longest common substrings beginning at first
positions in the old character string and second posi-
tions in the new character string; and

for a particular second position in the new character 
string, ascertaining the longest common substring 
beginning at the particular second position. 

12. A method as recited in claim 11, wherein the decom- 
pressing steps each comprise the following steps: 

evaluating the old or new version section by section; 
identifying compressed sections; and
decompressing the compressed sections. 

13. A method as recited in claim 12, further comprising 
the step of inserting spacers between the sections. 

14. A method as recited in claim 12, further comprising 
the step of writing into the upgrade file commands enabling
recompression of the compressed sections at the client. 

15. A method as recited in claim 1, wherein the old 
version and the new version are at least partly compressed, 
further comprising the step of identifying the matching 
sections by performing the following steps:

decompressing the old version to form a decompressed 
old version; 

decompressing the new version to form a decompressed 
new version; 

hashing every possible group of k contiguous bytes in the 
decompressed old version to form first hash values; 

hashing every possible group of k contiguous bytes in the 
decompressed new version to form second hash values; 
and

in an event that one of the first hash values equals one of 
the second hash values, comparing an old character 
string from the decompressed old version that includes 
the k contiguous bytes forming said one first hash value 
with a new character string from the decompressed new
version that includes the k contiguous bytes forming 
said one second hash value. 

16. A method as recited in claim 15, wherein the decom-
pressing steps each comprise the following steps:

evaluating the old or new version section by section; 
identifying compressed sections; and 
decompressing the compressed sections. 

17. A method as recited in claim 16, further comprising 
the step of inserting spacers between the sections. 

18. A method as recited in claim 16, further comprising 
the step of writing into the upgrade file commands enabling 
recompression of the compressed sections at the client. 

19. A method as recited in claim 1, further comprising the 
step of using the upgrade file to upgrade the old version to 
the new version. 

20. A computer-readable medium having computer- 
executable instructions for performing the steps as recited in 
claim 1. 

21. A method for constructing an upgrade file to upgrade 
from an old version of software to a new version of software, 
comprising the following steps: 

comparing an old character string from the old version
with a new character string from the new version; 

finding longest common substrings beginning at first 
positions in the old character string and second posi- 
tions in the new character string; 

for a particular second position in the new character 
string, ascertaining the longest common substring 
beginning at the particular second position; 

inserting, in the upgrade file, a pointer header representing 
the longest common substring in lieu of inserting the 
longest common substring, the pointer header indicat- 
ing a corresponding position in the old character string 
at which the longest common substring begins; 

inserting, in the upgrade file, characters from the new
character string that are not included in the longest 
common substring; and 

placing a data header indicating that the characters being 
inserted are not matched in the old character string. 

22. A method as recited in claim 21, wherein the finding 
step comprises the step of constructing a two-dimensional 
table having multiple entries cross-indexed by the old and 
new character strings, individual entries representing a 
length of a longest common substring beginning at a first 
position in the old character string and a second position in 
the new character string. 

23. A method as recited in claim 21, wherein the pointer 
header includes a byte count indicating a number of bytes 
contained in the longest common substring. 

24. A method as recited in claim 21, wherein the pointer 
header includes a byte count indicating a number of bytes 
contained in the longest common substring and an offset 
value identifying a corresponding first position in the old 
character string at which the longest common substring 
begins. 

25. A method as recited in claim 21, wherein the data 
header includes a byte count indicating a number of bytes of 
the characters being inserted. 

26. A method as recited in claim 21, further comprising 
the step of using the upgrade file to upgrade the old version 
to the new version. 

27. A method for constructing an upgrade file to upgrade 
from an old version of software to a new version of software, 
comprising the following steps: 

hashing every possible group of k contiguous bytes in the
old version to form first hash values;

hashing every possible group of k contiguous bytes in the
new version to form second hash values;

in an event that one of the first hash values equals one of
the second hash values, comparing an old character
string from the old version that includes the k contigu-
ous bytes forming said one first hash value with a new
character string from the new version that includes the
k contiguous bytes forming said one second hash value;

finding longest common substrings beginning at first
positions in the old character string and second posi-
tions in the new character string; and

for a particular second position in the new character
string, ascertaining the longest common substring
beginning at the particular second position.

28. A method as recited in claim 27, further comprising 
the following steps: 

inserting, in the upgrade file, a pointer header representing 
the longest common substring in lieu of inserting the 
longest common substring, the pointer header indicat- 
ing a corresponding position in the old character string 
at which the longest common substring begins; 

inserting, in the upgrade file, characters from the new 
character string that are not included in the longest 
common substring; and 

placing a data header indicating that the characters being 
inserted are not matched in the old character string.

29. A method as recited in claim 27, wherein the old 
version and the new version are at least partly compressed, 
further comprising the step of decompressing the old and 
new versions prior to the hashing steps. 

30. A method for upgrading software in a client remotely 30 
from an upgrade server, comprising the following steps: 

at the upgrade server, performing the following steps: 

comparing an old character string from an old version 
of software with a new character string from a new 
version of software; 35 

finding longest common substrings beginning at a first 
position in the old character string and a second 
position in the new character string; 

for a particular second position in the new character 
string, ascertaining the longest common substring 40 
beginning at the particular second position; 

inserting, in an upgrade file, a pointer header repre- 
senting the longest common substring in lieu of 
inserting the longest common substring, the pointer 
header indicating a corresponding position in the old
character string at which the longest common sub- 
string begins; 

inserting, in the upgrade file, characters from the new 
character string that are not included in the longest 
common substring;

placing a data header indicating that the characters 
being inserted are not matched in the old character
string; and 

transferring the upgrade file to the client; 

at the client, performing the following steps:

receiving the upgrade file from the upgrade server; 

processing the upgrade server to reconstruct the new 
version of software; 

for any section in the upgrade file demarcated by the
data header, adding the new characters to the
reconstructed new version; and

for any section in the upgrade file demarcated by the 
pointer header, copying the longest common sub- 
string from the old version into the reconstructed 
new version.
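
The server-side and client-side steps of claim 30 can be illustrated with a minimal sketch. It is not the patented implementation: the record layout (Python tuples tagged `"PTR"` and `"DATA"`), the function names, the brute-force search, and the `min_len` threshold below which a match is sent as literal data are all assumptions made for illustration.

```python
def longest_match(old, new, j):
    """Return (offset, length) of the longest substring of `old` that
    matches `new` starting at position j (brute-force search)."""
    best_off, best_len = 0, 0
    for i in range(len(old)):
        n = 0
        while i + n < len(old) and j + n < len(new) and old[i + n] == new[j + n]:
            n += 1
        if n > best_len:
            best_off, best_len = i, n
    return best_off, best_len


def make_upgrade(old, new, min_len=4):
    """Server side: build a list of pointer and data records.
    min_len is an assumed threshold, not taken from the claims."""
    records, literal, j = [], bytearray(), 0
    while j < len(new):
        off, n = longest_match(old, new, j)
        if n >= min_len:
            if literal:
                # Flush pending unmatched bytes under a data header.
                records.append(("DATA", bytes(literal)))
                literal = bytearray()
            # Pointer header: offset into the old version plus byte count.
            records.append(("PTR", off, n))
            j += n
        else:
            literal.append(new[j])
            j += 1
    if literal:
        records.append(("DATA", bytes(literal)))
    return records


def apply_upgrade(old, records):
    """Client side: rebuild the new version from the old version."""
    out = bytearray()
    for rec in records:
        if rec[0] == "DATA":
            out += rec[1]                 # new characters carried in the file
        else:
            _, off, n = rec
            out += old[off:off + n]       # matching section copied locally
    return bytes(out)
```

For example, `apply_upgrade(old, make_upgrade(old, new))` returns `new` for any pair of byte strings, while matching sections are transmitted only as pointer records.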

31. A method as recited in claim 30, wherein the finding
step comprises the step of constructing a two-dimensional
table having multiple entries cross-indexed by the old and
new character strings, each entry representing the longest 
common substring beginning at the first position in the old 
character string and at the second position in the new 
character string. 
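
One way to picture the two-dimensional table of claim 31 is the sketch below. It assumes each entry holds the length of the common substring beginning at position `i` in the old string and position `j` in the new string, and it fills the table from the end of both strings backward; the function names and the fill order are illustrative assumptions rather than the claimed procedure.

```python
def build_match_table(old, new):
    """table[i][j] = length of the longest common substring that begins
    at old[i] and new[j]. One extra row and column of zeros serve as
    the boundary condition."""
    rows, cols = len(old), len(new)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if old[i] == new[j]:
                # A match here extends the match starting one step later.
                table[i][j] = table[i + 1][j + 1] + 1
    return table


def longest_match_at(table, j):
    """For a particular second position j, scan column j and return
    (first_position, length) of the longest common substring,
    or (-1, 0) if nothing in the old string matches."""
    best_i, best_len = -1, 0
    for i in range(len(table) - 1):
        if table[i][j] > best_len:
            best_i, best_len = i, table[i][j]
    return best_i, best_len
```

The table costs time and memory proportional to the product of the two string lengths, which is why a filtering step such as hashing is attractive for large files.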

32. A method as recited in claim 30, wherein the pointer 
header includes a byte count indicating a number of bytes 
contained in the longest common substring. 

33. A method as recited in claim 30, wherein the pointer 
header includes a byte count indicating a number of bytes 
contained in the longest common substring and an offset 
value identifying a corresponding first position in the old 
character string at which the longest common substring 
begins. 

34. A method as recited in claim 30, wherein the data 
header includes a byte count indicating a number of bytes of 
the characters being inserted. 

35. A software upgrading system, comprising: 

a client having a processor and a memory, the memory 
storing an old version of software; 

an upgrade server having a processor and a memory, the 
upgrade server memory storing both the old version of 
software and a new version of software, the upgrade 
server identifying a longest common substring that is 
common to both the old version and the new version 
and creating an upgrade file with a pointer header 
representing the longest common substring in lieu of 
the longest common substring, the pointer header indi- 
cating a corresponding position in the old version at 
which the longest common substring begins, and the 
upgrade file further containing at least one section from 
the new version that is not included in the longest 
common substring and a data header indicating that the 
section is new; and 

the client processing the upgrade file to reconstruct the 
new version of software by adding the new section 
corresponding to the data header and using the longest 
common substring in the old version that is identified 
by the pointer header. 

36. A system as recited in claim 35, wherein the upgrade 
server constructs a table having multiple entries, in which an 
individual entry represents a length of a longest common 
substring beginning at a first position in the old character 
string and at a second position in the new character string, 
the upgrade server ascertaining the longest common sub- 
string beginning at a particular second position. 

37. A system as recited in claim 35, wherein the upgrade 
server hashes every possible group of k contiguous bytes in 
the old version and every possible group of k contiguous 
bytes in the new version and only considers substrings that 
include k contiguous bytes that hash to the same value. 
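
The hashing filter of claim 37 can be sketched as follows. This is an illustrative assumption of how such a filter might look: it uses Python's built-in `hash` as a stand-in for whatever hash function an implementation would choose, extends matches only forward from the k-byte group, and the names `index_kgrams` and `candidate_matches` are hypothetical.

```python
from collections import defaultdict


def index_kgrams(data, k):
    """Map the hash of every group of k contiguous bytes to the
    positions where that group starts."""
    index = defaultdict(list)
    for i in range(len(data) - k + 1):
        index[hash(data[i:i + k])].append(i)
    return index


def candidate_matches(old, new, k):
    """For each position j in `new`, consider only positions in `old`
    whose k-byte group hashes to the same value, then verify and
    extend the match byte by byte. Returns {j: (old_offset, length)}."""
    old_index = index_kgrams(old, k)
    matches = {}
    for j in range(len(new) - k + 1):
        best = (-1, 0)
        for i in old_index.get(hash(new[j:j + k]), []):
            n = 0
            while i + n < len(old) and j + n < len(new) and old[i + n] == new[j + n]:
                n += 1
            # n < k means the hashes collided without a real match.
            if n >= k and n > best[1]:
                best = (i, n)
        if best[1]:
            matches[j] = best
    return matches
```

Because equal byte groups always hash to the same value, no common substring of at least k bytes is missed, while positions with no shared k-byte group are skipped without any byte-by-byte comparison.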

38. A system as recited in claim 35, wherein the pointer 
header includes a byte count indicating a number of bytes 
contained in the longest common substring. 

39. A system as recited in claim 35, wherein the pointer 
header includes a byte count indicating a number of bytes 
contained in the longest common substring and an offset 
value identifying a corresponding first position in the old 
version at which the longest common substring begins. 

40. A system as recited in claim 35, wherein the data 
header includes a byte count indicating a number of bytes of 
the section being added. 

41. A computer-readable medium that stores computer- 
executable instructions for directing a computer to perform 
the following steps: 

comparing an old character string from an old version of 
software with a new character string from a new
version of software; 

finding longest common substrings beginning at first
positions in the old character string and second posi- 
tions in the new character string; 

for a particular second position in the new character 
string, ascertaining the longest common substring
beginning at the particular second position; 

inserting, in the upgrade file, a pointer header representing 
the longest common substring in lieu of inserting the 
longest common substring, the pointer header indicat- 
ing a corresponding position in the old character string
at which the longest common substring begins; 

inserting, in the upgrade file, characters from the new 
character string that are not included in the longest 
common substring; and

placing a data header indicating that the characters being 
inserted are not matched in the old character string. 

42. A computer-readable medium that stores computer- 
executable instructions for directing a computer to perform 
the following steps: 

hashing every possible group of k contiguous bytes in the
old version to form first hash values;
hashing every possible group of k contiguous bytes in the
new version to form second hash values;
in an event that one of the first hash values equals one of 
the second hash values, comparing an old character 
string from the old version that includes the k contigu- 
ous bytes forming said one first hash value with a new 
character string from the new version that includes the 
k contiguous bytes forming said one second hash value; 
finding longest common substrings beginning at first 
positions in the old character string and second posi- 
tions in the new character string; and 
for a particular second position in the new character 
string, ascertaining the longest common substring 
beginning at the particular second position. 

* * * * * 